Rank | Count | Beginning |
---|---|---|
12341 | 3074 | थ्व |
17083 | 842 | नेपायागु |
1619 | 654 | ५%, |
21438 | 633 | भारतयागु |
4755 | 332 | इ॰ |
23267 | 250 | मिजंतेगु |
27849 | 217 | समुद्र |
27622 | 116 | सन् |
24735 | 108 | राग |
12115 | 56 | थुकिया |
17932 | 45 | नेपाल |
11983 | 43 | थन्यागु |
3316 | 38 | अथे |
11911 | 37 | थनया |
25914 | 36 | व |
12203 | 35 | थुकिलि |
10069 | 31 | छुं |
26112 | 30 | वय्कःया |
6205 | 27 | ऒरु |
9967 | 27 | छगू |
11916 | 27 | थनयागु |
25722 | 27 | लेक |
26055 | 27 | वय्कलं |
5388 | 25 | इस्ट |
12138 | 25 | थुकियात |
16670 | 24 | नापं |
16672 | 24 | नापं, |
11337 | 23 | तर |
9196 | 22 | चक |
19725 | 22 | प्राचीन |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
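The query above fixes N = 1. As a rough illustration of how the same count generalizes to N = 2, 3, 4, the following Python sketch mirrors the `substring_index` logic outside the database. The function name `sentence_beginnings`, its parameters, and the sample sentences are hypothetical and are not part of the corpus tooling; it assumes sentences are plain strings whose words are separated by single spaces, as in the query.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n words.

    Hypothetical helper: it splits on single spaces, like
    substring_index(sentence, ' ', n) in the SQL query.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split(" ")
        # A sentence with fewer than n words contributes its whole text,
        # which matches substring_index when the delimiter occurs < n times.
        counts[" ".join(tokens[:n])] += 1
    # most_common sorts by descending count, like ORDER BY cnt DESC LIMIT 50.
    return counts.most_common(limit)


# Made-up sample sentences, used only to show the call.
sample = ["the cat sat", "the dog ran", "a bird flew", "the cat slept"]
one_word = sentence_beginnings(sample, n=1)
two_words = sentence_beginnings(sample, n=2)
```

With this toy input, `one_word` ranks "the" first with a count of 3, and `two_words` ranks "the cat" first with a count of 2. Punctuation stays attached to the word, as it does in the table (for example a trailing comma), because neither the query nor this sketch tokenizes beyond splitting on spaces.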